Eva Freyhult
NBIS, SciLifeLab
2022-09-13
Statistical inference means drawing conclusions about properties of a population based on observations from a random sample of that population.
To perform a hypothesis test is to evaluate a hypothesis based on a random sample.
Typically, the hypotheses that are tested are assumptions about properties of a population, such as proportion, mean, mean difference, variance etc.
There are two hypotheses involved in a hypothesis test, the null hypothesis, \(H_0\), and the alternative hypothesis, \(H_1\).
\(H_0\), the null hypothesis, is in general neutral: “no change”, “no difference between groups”, “no association”.
In general we want to show that \(H_0\) is false.
\(H_1\), the alternative hypothesis, expresses what the researcher is interested in: “the treatment has an effect”, “there is a difference between groups”, “there is an association”.
The alternative hypothesis can also be directional, e.g. “the treatment has a positive effect”.
A sampling distribution is the distribution of a sample statistic. The sampling distribution can be obtained by drawing a large number of samples from a specific population and computing the statistic for each sample.
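The idea can be sketched by simulation. The population below is hypothetical (normally distributed weights with mean 25 g and standard deviation 5 g are assumed values, not taken from any study), and `sampling_distribution` is an illustrative helper name.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 weights, mean ~25 g, sd ~5 g (assumed values)
population = [rng.gauss(25, 5) for _ in range(10_000)]

def sampling_distribution(population, n, n_samples, rng):
    """Draw n_samples random samples of size n and return their sample means."""
    return [statistics.mean(rng.sample(population, n)) for _ in range(n_samples)]

sample_means = sampling_distribution(population, n=12, n_samples=2_000, rng=rng)

# Centre and spread of the simulated sampling distribution of the mean
print(round(statistics.mean(sample_means), 2))
print(round(statistics.stdev(sample_means), 2))
```

The sample means cluster around the population mean and vary much less than the individual observations do.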
The null distribution is a sampling distribution when the null hypothesis is true.
Figure: A null distribution.
The p-value is the probability of the observed value, or something more extreme, if the null hypothesis is true.
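As a formula, for a one-sided test where large values count as extreme, this can be written as below. The symbols \(T\) (the test statistic) and \(t_{obs}\) (its observed value) are notation assumed here for illustration.

\[ p = P(T \geq t_{obs} \mid H_0) \]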
| | H0 is true | H0 is false |
|---|---|---|
| Accept H0 | Correct decision | Type II error, miss |
| Reject H0 | Type I error, false alarm | Correct decision |
The significance level: \(\alpha = P(\text{false alarm}) = P(\text{Reject } H_0 \mid H_0 \text{ is true})\).
The significance level should be set before the hypothesis test is performed!
Common values of \(\alpha\) are 0.05 or 0.01.
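What \(\alpha\) means in practice can be illustrated by simulating many experiments in which \(H_0\) is true and counting how often it is rejected anyway. The sketch below swaps in a one-sided z-test with known standard deviation purely because its p-value is easy to compute; the population values and function names are assumptions for illustration.

```python
import random
from statistics import NormalDist

def one_sided_z_p_value(sample, mu0, sigma):
    """p-value for H1: mu > mu0 when the population sd, sigma, is known."""
    sample_mean = sum(sample) / len(sample)
    z = (sample_mean - mu0) / (sigma / len(sample) ** 0.5)
    return 1 - NormalDist().cdf(z)

def false_alarm_rate(alpha, n_experiments=5_000, n=12, seed=1):
    """Fraction of simulated experiments that reject H0 although H0 is true."""
    rng = random.Random(seed)
    mu0, sigma = 25.0, 5.0  # hypothetical population values (assumed)
    rejections = 0
    for _ in range(n_experiments):
        # Data are generated with mean mu0, so H0 is true in every experiment
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        if one_sided_z_p_value(sample, mu0, sigma) < alpha:
            rejections += 1
    return rejections / n_experiments

print(false_alarm_rate(alpha=0.05))  # expected to be close to 0.05
```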
If the p-value is above the significance level, \(H_0\) is accepted (more precisely, not rejected).
If the p-value is below the significance level, \(H_0\) is rejected.
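The decision rule can be written as a small function. The name `decide` is hypothetical, and the text does not say what happens when the p-value equals \(\alpha\); rejecting in that case is a convention assumed here.

```python
def decide(p_value, alpha=0.05):
    """Return the test decision for a p-value at significance level alpha."""
    # Assumption: p == alpha counts as a rejection (not specified in the text)
    if p_value <= alpha:
        return "reject H0"
    return "accept H0"

print(decide(0.03))              # reject H0
print(decide(0.20))              # accept H0
print(decide(0.03, alpha=0.01))  # accept H0
```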
Does a high-fat diet lead to increased body weight?
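Study setup: 24 female mice are ordered from a lab and randomly assigned to two groups of 12; one group receives a high-fat diet and the other an ordinary diet (control), after which each mouse is weighed.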
The observed values, mouse weights in grams, are summarized below:
| high-fat | 25 | 30 | 23 | 18 | 31 | 24 | 39 | 26 | 36 | 29 | 23 | 32 |
| ordinary | 27 | 25 | 22 | 23 | 25 | 37 | 24 | 26 | 21 | 26 | 30 | 24 |
1. Null and alternative hypotheses
\[ \begin{aligned} H_0: \mu_2 = \mu_1 \iff \mu_2 - \mu_1 = 0\\ H_1: \mu_2>\mu_1 \iff \mu_2-\mu_1 > 0 \end{aligned} \]
where \(\mu_2\) is the (unknown) mean body weight of the high-fat mouse population and \(\mu_1\) is the (unknown) mean body weight of the control mouse population.
Studied population: Female mice that can be ordered from a lab.
2. Select appropriate significance level \(\alpha\)
\[\alpha = 0.05\]
3. Test statistic
Of interest: the difference in mean weight between high-fat and control mice
\[D = \bar X_2 - \bar X_1\]
Mean weight of 12 (randomly selected) mice on ordinary diet, \(\bar X_1\). \(E[\bar X_1] = E[X_1] = \mu_1\)
Mean weight of 12 (randomly selected) mice on high-fat diet, \(\bar X_2\). \(E[\bar X_2] = E[X_2] = \mu_2\)
Observed values:
Mean weight of control mice (ordinary diet): \(\bar x_1 = 25.83\)
Mean weight of mice on high-fat diet: \(\bar x_2 = 28.00\)
Difference in mean weights: \(d_{obs} = \bar x_2 - \bar x_1 = 2.1667\)
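The observed values can be reproduced from the weight table with a few lines of Python; the variable names are chosen here for illustration.

```python
# Mouse weights in grams, copied from the table above
high_fat = [25, 30, 23, 18, 31, 24, 39, 26, 36, 29, 23, 32]
ordinary = [27, 25, 22, 23, 25, 37, 24, 26, 21, 26, 30, 24]

x_bar_2 = sum(high_fat) / len(high_fat)  # mean weight, high-fat diet
x_bar_1 = sum(ordinary) / len(ordinary)  # mean weight, ordinary diet
d_obs = x_bar_2 - x_bar_1                # observed difference in means

print(f"{x_bar_1:.2f} {x_bar_2:.2f} {d_obs:.4f}")  # 25.83 28.00 2.1667
```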
4. Null distribution
If the high-fat diet has no effect, i.e. if \(H_0\) were true, the result would be as if all mice were given the same diet.
The 24 mice were initially from the same population. Depending on how they happen to be randomly assigned to the high-fat and ordinary groups, the group mean weights would differ, even if the two groups were treated the same.
Random reassignment to two groups can be accomplished using permutation.
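Assume \(H_0\) is true, i.e. assume all mice are equivalent, and
1. randomly reassign the 24 mice to two groups of 12, labelled high-fat and ordinary, and
2. compute the difference in mean weights between the two groups.
If we repeat steps 1-2 many times we get the sampling distribution when \(H_0\) is true, the so-called null distribution, of the difference in mean weights.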
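A minimal sketch of the reassign-and-recompute procedure described above, using only the Python standard library. The number of permutations (10,000), the seed, and the function names are assumptions made for illustration.

```python
import random

# Mouse weights in grams, copied from the table above
high_fat = [25, 30, 23, 18, 31, 24, 39, 26, 36, 29, 23, 32]
ordinary = [27, 25, 22, 23, 25, 37, 24, 26, 21, 26, 30, 24]

def mean_difference(group_2, group_1):
    """Difference in mean weight, group 2 minus group 1."""
    return sum(group_2) / len(group_2) - sum(group_1) / len(group_1)

def null_distribution(group_2, group_1, n_permutations=10_000, seed=1):
    """Mean differences obtained by randomly reassigning group labels."""
    rng = random.Random(seed)
    pooled = list(group_2) + list(group_1)  # under H0 all mice are equivalent
    n_2 = len(group_2)
    differences = []
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random reassignment to the two groups
        differences.append(mean_difference(pooled[:n_2], pooled[n_2:]))
    return differences

null_diffs = null_distribution(high_fat, ordinary)
print(len(null_diffs), round(sum(null_diffs) / len(null_diffs), 2))
```

Because the labels are assigned at random, the simulated differences scatter around zero.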
5. Compute p-value
What is the probability of getting a mean difference at least as extreme as our observed value, \(d_{obs}\), if \(H_0\) were true?
\(P(\bar X_2 - \bar X_1 \geq d_{obs} \mid H_0) = 0.17\)
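The p-value can be estimated as the fraction of permuted mean differences that are at least as large as \(d_{obs}\). This is a Monte Carlo sketch, not necessarily how the figure above was computed: the number of permutations, the seed, and the function names are assumptions, and the estimate varies slightly with the seed.

```python
import random

# Mouse weights in grams, copied from the table above
high_fat = [25, 30, 23, 18, 31, 24, 39, 26, 36, 29, 23, 32]
ordinary = [27, 25, 22, 23, 25, 37, 24, 26, 21, 26, 30, 24]

def mean_difference(group_2, group_1):
    """Difference in mean weight, group 2 minus group 1."""
    return sum(group_2) / len(group_2) - sum(group_1) / len(group_1)

def permutation_p_value(group_2, group_1, n_permutations=10_000, seed=1):
    """One-sided p-value: share of permuted differences >= the observed one."""
    rng = random.Random(seed)
    d_obs = mean_difference(group_2, group_1)
    pooled = list(group_2) + list(group_1)
    n_2 = len(group_2)
    n_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign mice to groups at random
        if mean_difference(pooled[:n_2], pooled[n_2:]) >= d_obs:
            n_extreme += 1
    return n_extreme / n_permutations

p_value = permutation_p_value(high_fat, ordinary)
print(round(p_value, 2))
```

The estimate should land near the reported 0.17, up to Monte Carlo error.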
6. Conclusion?